C++: Speed up cpp/leap-year/unchecked-after-arithmetic-year-modification#21768
C++: Speed up cpp/leap-year/unchecked-after-arithmetic-year-modification#21768
cpp/leap-year/unchecked-after-arithmetic-year-modification#21768Conversation
…er an initial round of the query to reduce the number of sinks.
There was a problem hiding this comment.
Pull request overview
This PR addresses a severe performance regression in the C++ query cpp/leap-year/unchecked-after-arithmetic-year-modification by restructuring the query’s dataflow computations to drastically reduce the number of sources/sinks (and resulting PathNodes) considered during evaluation.
Changes:
- Introduces an initial “pre-filter” dataflow (
OperationToYearAssignmentConfig0/OperationToYearAssignmentFlow0) to identify only operation-source candidates that can reach a year assignment. - Restricts the sinks of
IgnorableOperationToOperationSourceCandidateConfigto only those operation-source candidates that are relevant to year assignments. - Refactors the final
OperationToYearAssignmentConfigto use the restricted set and apply the ignorable-operation filtering in a later stage.
Show a summary per file
| File | Description |
|---|---|
| cpp/ql/src/Likely Bugs/Leap Year/UncheckedLeapYearAfterYearModification.ql | Reorders and splits dataflow configurations to constrain sink/source sets and improve evaluation performance. |
Copilot's findings
- Files reviewed: 1/1 changed files
- Comments generated: 1
…ification.ql Co-authored-by: Jeroen Ketema <93738568+jketema@users.noreply.github.com>
Not really sure I agree with Copilot's definition of "severe" here since this was 1 repository across all of Microsoft, but oh well |
There was a problem hiding this comment.
Looks fine and reasonable, performance is about the same in my local testing but I can see how this has a dramatic effect on certain cases.
I think I've said it before and I'll say it again: this file really needs a diagram of how all the data flows fit together. Its a bit late for this change, but next time I do a review on this query, I might hold you / the author to this.
Not really sure I agree with Copilot's definition of "severe" here since this was 1 repository across all of Microsoft, but oh well
😆
Partly fair. I also have very little idea about how this query works so I'm glad you're not blocking the approval on that 😅 I was only concerned about fixing the performance |
This PR fixes a performance problem in cpp/leap-year/unchecked-after-arithmetic-year-modification which was caused by #21292.
The problem can be seen in this partial run:
The underlying problem is that the dataflow computation specified by
IgnorableOperationToOperationSourceCandidateConfighas way too many sources and sinks. So even though it is restricted to flows where the source and the sink are in the same enclosing callable there is way too manyPathNodes.To fix this problem we modify the ordering of dataflows. On
mainwe do (ignoring some of the other dataflow configurations used in this query):IgnorableOperationToOperationSourceCandidateConfigto identify operations to ignoreOperationSource, which then serve as the sources for the "final" dataflow computation specified byOperationToYearAssignmentConfig.After this PR we do:
OperationToYearAssignmentConfig0(which is equal to the originalOperationToYearAssignmentConfig, but without the filtering done byIgnorableOperationToOperationSourceCandidateConfig)IgnorableOperationToOperationSourceCandidateConfig. This is the same computation as we do onmain, but now restricted to only relevant sinks (which we get from (1))OperationToYearAssignmentConfig(which is equal toOperationToYearAssignmentConfig0, but with the restricted set of sinks from (2))On the particular database I was testing on this now completes "trivially" (since sentinals now detect that the query will have 0 results since
UncheckedLeapYearAfterYearModification::OperationToYearAssignmentFlow::Flow::S6::flowPath/2#9b05e974/2is empty).